ReadMe CONTENTS (in order of appearance; search for a section title to jump to it):

%%%%%%%%%%%%%%%%%%% INSTRUCTIONS FOR EXECUTING ALL ANALYSES
%%%%%%%%%%%%%%%%%%% DIRECTORY OF PRIMARY FILES
%%%%%%%%%%%%%%%%%%% DIRECTORY OF SUPPORTING FILES
%%%%%%%%%%%%%%%%%%% DIRECTORY OF DATA FILES
%%%%%%%%%%%%%%%%%%% DIRECTORY OF TABLES AND FIGURES
%%%%%%%%%%%%%%%%%%% DATA DICTIONARY



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%             INSTRUCTIONS FOR EXECUTING ALL ANALYSES:            %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
IMPORTANT NOTE: VARIABLE LABELS IN MATLAB ANALYSIS CODE FOLLOW NOTATION CONVENTIONS FROM PREVIOUS VERSIONS OF THE PAPER.  IN PARTICULAR, 
"T" IN MATLAB IS "T" IN THE PAPER;
"Q" IN MATLAB IS "A" IN THE PAPER;
OBJECTS LABELLED "E" (FOR "EFFICIENCY") CORRESPOND TO OBJECTS LABELLED _p (FOR "PRODUCTIVITY") IN THE PAPER (E.G., betaE in MATLAB = \boldsymbol{\beta}_p in the paper)
OBJECTS LABELLED "L" (FOR "LEISURE PREFERENCE") CORRESPOND TO OBJECTS LABELLED _m (FOR "MOTIVATION") IN THE PAPER  (E.G., betaL in MATLAB = \boldsymbol{\beta}_m in the paper)

(0) TO RUN ALL ANALYSES UNDER DEFAULT SETTINGS:  first, extract archived files stored inside BOOTSTRAPS folder using 7-zip (free and downloadable on the internet).  After 7-zip is installed, simply double click on ProcessBootstrapsMSM7.7z.001 and follow the prompt.  This will re-assemble a MATLAB data file, ProcessBootstrapsMSM7.mat (see also note below).  Then, open AAA_File00_Driver.m and click "Run" in the MATLAB editor.  This will compute Stage 3 and 4 estimates under the default specification 4 (in Tables 2, 3, and 6).  Otherwise, to run part of the analysis, or to execute estimation under alternate settings, follow the instructions below.  

(1) TO RUN STAGE 1 OF ESTIMATION: navigate cursor to lines 1-36 of AAA_File00_Driver.m and click "Run Section" in the MATLAB editor.

(2) TO RUN DESCRIPTIVE ANALYSES: navigate cursor to lines 37-204 of AAA_File00_Driver.m and click "Run Section" in the MATLAB editor.

(3) TO RUN STAGE 2 OF ESTIMATION: navigate cursor to lines 205-261 of AAA_File00_Driver.m and click "Run Section" in the MATLAB editor.

(4) TO RUN BOOTSTRAP ESTIMATION FOR STAGES 1 & 2 OF ESTIMATION: first, navigate cursor to lines 262-481 of AAA_File00_Driver.m and click "Run Section" in the MATLAB editor.  Then, change the value of the variable "RunBootstrap" on line 531 of AAA_File00_Driver.m to "1," and with cursor between lines 482 and 690, click "Run Section" in the MATLAB editor.  BEFORE EXECUTING, BE SURE TO READ NOTES ON LINE 510 OF AAA_File00_Driver.m.  The user should expect this to take a LONG time to complete (weeks or months depending on computer hardware).  For this reason, we also include an alternate way of proceeding directly to Stages 3/4, as follows:

(4 ALTERNATE) TO LOAD PRE-ESTIMATED BOOTSTRAPS AND EXECUTE SETUP FOR STAGES 3 AND 4: first, extract archived files stored inside BOOTSTRAPS folder using 7-zip (free and downloadable on the internet).  After 7-zip is installed, simply double click on ProcessBootstrapsMSM7.7z.001 and follow the prompt.  This will re-assemble a MATLAB data file, ProcessBootstrapsMSM7.mat (see also note below).  Second, make sure "RunBootstrap" on line 530 of AAA_File00_Driver.m is set to "0" and with cursor between lines 482 and 690, click "Run Section" in the MATLAB editor.  

(5) TO RUN STAGE 3 OF ESTIMATION: After executing one of the two alternatives for STEP 4 above, on line 791 of AAA_File00_Driver.m, set the "Model" variable to one of the following values: {1,2,3,4.05,4.085}; each value will compute a different specification of Tables 2 & 3 in the paper.  Then, set the value of "RunTobitEstimator" on line 1031 to a value of 1, and with cursor between lines 691 and 2176 of AAA_File00_Driver.m, click "Run Section" in the MATLAB editor (SIDE NOTE: the "FIRST_PASS" variable provides a way of generating feasible start guesses, which is not needed, given the stored start guesses in the STAGE3 folder).  This will take between 8 and 36 hours to terminate, depending on which specification is selected.  Given the long computation time, we also include an alternate way of proceeding directly to Stage 4, as follows: 

(5 ALTERNATE) TO LOAD PRE-ESTIMATED STAGE 3 PARAMETERS AND RUN WALD TESTS: After executing one of the two alternatives for STEP 4 above, on line 791 of AAA_File00_Driver.m, set "Model" variable to one of the following values: {1,2,3,4.05,4.085}; each value will compute a different specification of Tables 2 & 3 in the paper.  Then, set the value of "RunTobitEstimator" on line 1031 to a value of 0, and with cursor between lines 691 and 2176 of AAA_File00_Driver.m, click "Run Section" in the MATLAB editor.  

(6) TO RUN STAGE 4 OF ESTIMATION:  After executing one of the two alternatives for each of STEPS 4 and 5 above, navigate cursor to lines 2177-2795 of AAA_File00_Driver.m and click "Run Section" in the MATLAB editor.
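
The fast path through steps (4 ALTERNATE), (5 ALTERNATE), and (6) comes down to three switches.  The sketch below only summarizes them; in practice the values are edited in place inside AAA_File00_Driver.m at the line numbers given above.  The variable names come from the instructions; "validModels" and the assertion are illustrative additions and are not part of the replication package.

```matlab
% Illustrative summary of the switches used by the fast path.
% In practice these values are edited in place inside AAA_File00_Driver.m.
RunBootstrap      = 0;      % 0 = load pre-estimated bootstraps (step 4 ALTERNATE)
Model             = 4.05;   % one of {1, 2, 3, 4.05, 4.085}
RunTobitEstimator = 0;      % 0 = load stored Stage 3 estimates (step 5 ALTERNATE)

% Hypothetical sanity check (not part of the replication package).
validModels = [1 2 3 4.05 4.085];
assert(ismember(Model, validModels), 'Model must be one of {1,2,3,4.05,4.085}.');
```

Setting RunBootstrap or RunTobitEstimator to 1 instead selects the long-running paths described in steps (4) and (5).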



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%                     DIRECTORY OF PRIMARY FILES:                 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
****AAA_File00_Driver.m
		This file calls all subsequent files for estimation and analysis in Stages 1-4.  See instructions above.
		
****AAA_File01_ThetaEDriver.m 
		This file performs Stage-1 estimation as described in Section 4.1, including the test for selectivity bias by Verbeek and Nijman.
		
****AAA_File02_DescriptiveAnalyses.m 
		This file computes descriptive analyses, including Table 1, Figure 1, and Figure OS.2.
		
****AAA_File03_ThetaLDriver.m 
		This is a MATLAB script file that executes Stage-2 estimation.  

****AAA_FILE04_BootstrapDriver_MultiMethod7.m 
		This is a MATLAB script file that runs the bootstrap with a multiple-restart routine for each bootstrap sample in order to locate the global optimum for each one.  Due to the long computation time for each bootstrap sample, it is recommended to run bootstraps with parallelized objective function evaluations inside ThetaLEstimatorMSM.m, and also to parallelize bootstraps in batches.  See related note on LINE 530 of AAA_File00_Driver.m.  ****NOTE: USING HARDWARE/SOFTWARE AVAILABLE IN 2025, THE USER SHOULD EXPECT THAT RUNNING BOOTSTRAPS WILL TAKE AT LEAST MULTIPLE WEEKS OF CONTINUOUS COMPUTATION TIME.  IT IS ALSO NOT RECOMMENDED TO RUN THE ANALYSIS FILES FOR THIS PROJECT WITH LESS THAN 64GB OF MEMORY.  COMPUTATION OF POINT ESTIMATES AND BOOTSTRAPS WAS EXECUTED ON A WINDOWS MACHINE WITH A 12th Gen Intel(R) Core(TM) i9-12950HX 2.30 GHz PROCESSOR AND 64GB OF RAM.  RUNTIME FOR EACH BOOTSTRAP ESTIMATE AVERAGED BETWEEN 20 AND 60 MINUTES.  
		
****ProcessBootstraps.m 
		This is a MATLAB script file that (1) consolidates bootstrap estimate batch files, (2) performs a final sanity check for solution quality using all bootstrap estimates, (3) computes bootstrapped estimates of productivity and motivation traits for each active student, and (4) computes bootstrap confidence intervals for common parameters, and outputs some plots for Stages 1 and 2 of estimation.  NOTE: ProcessBootstraps.m runs in four different steps, depending on the values of 2 variables, "Consolidate" and "LoadVarious."  The value of Consolidate is set to 0 currently, which loads the 400 pre-computed bootstraps used in the paper and contained in ProcessBootstrapsMSM7.mat.  If the user wishes to run a fresh set of bootstrap estimates and then process them to that format, start by setting the value of Consolidate to 1 (line 3) and the value of LoadVarious to 1 (line 9).  Then, execute ProcessBootstraps.m.  This will throw an error message with instructions on what to do next, and so on.  Continue running and following instructions until the process is complete (four steps total).  
		
****AAA_FILE05_InitialHCProductionRF.m 
		This is a MATLAB script file that executes the analysis in Appendix C.2, including Specifications (1)-(4) in Table OS.3, and Figure OS.1.  Called by AAA_File00_Driver.m.  IMPORTANT: Although this analysis is no longer part of the main body, this file must be executed BEFORE running AAA_FILE06_HCProductionShortRun4point05.m, whose analysis is part of the main body.  ALSO, before executing this file, be sure that the variable "Model" on line 791 of AAA_File00_Driver.m is set to a value of 4.05.
		
****AAA_FILE05_InitialHCProductionRF4point085.m 
		This is a MATLAB script file that executes analysis in Appendix C.2 for Specification (5) in Table OS.3.  Called by AAA_File00_Driver.m.  IMPORTANT: Although this analysis is no longer part of the main body, this file must be executed BEFORE running AAA_FILE06_HCProductionShortRun4point085.m.  ALSO, before executing this file, be sure that the variable "Model" on line 791 is set to a value of 4.085.
		
****AAA_FILE06_HCProductionShortRun4point05.m 
		This is a MATLAB script file that runs the skill production technology analysis for Section 6, including Table 6 (all specifications) and Figure 8.  Called by AAA_File00_Driver.m.  IMPORTANT: before executing this file, be sure that the variable "Model" on line 791 is set to a value of 4.05. 
		
****AAA_FILE06_HCProductionShortRun4point085.m 
		This is a MATLAB script file that runs a fifth specification of the skill production technology analysis in Table 6, which is not reported due to numerical instability (see footnote 46).  Called by AAA_File00_Driver.m.  IMPORTANT: before executing this file, be sure that the variable "Model" on line 791 is set to a value of 4.085.  



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%                DIRECTORY OF SUPPORTING FILES:                %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

****BsplineBasis3.m
		This is a function file that computes cubic B-spline basis functions and/or their first and second derivatives.  For syntax and more information, type "help BsplineBasis3" on the command line.

****BsplineEval3.m
		This is a function file that evaluates a cubic B-spline function and/or its derivatives.  For syntax and more information, type "help BsplineEval3" on the command line.
		
****BsplineCDFLsqFit.m
		This is a function file which accepts as inputs (1) a sample of data, (2) a knot vector, (3) a grid of evaluation points, and (4) sample weights (optional).  It produces a CDF estimate, which enforces monotonicity and boundary conditions (F(x)=0 at lower support bound and F(x)=1 at upper support bound).  The output of the function is a vector of B-spline basis function weights.  
		
****NVtest.m 
		This is a function file that runs a test for selectivity bias in unbalanced panel data, proposed by Verbeek and Nijman (1986).  The null hypothesis of the test is that there is no selectivity bias (i.e., there is no selection on the basis of error terms).  It accepts as input a panel dataset, and produces an output structure containing the coefficient of interest for the test, its standard error, a t-statistic, a 95% confidence interval, a p-value, and degrees of freedom.
		
****ThetaEEstimator.m 
		This is a MATLAB function file that performs Stage-1 estimation, including equation (5) and the heteroskedastic shock distributions.
		
****ThetaLEstimatorMSM.m
		This is a MATLAB function file that runs Stage-2 estimation, including cost function and motivation fixed effects.  
		
****InitialguessFq.m
		This is a MATLAB function file that generates a feasible initial guess for the B-spline CDFs of labor supply.
		
****InitialguessC.m
		This is a MATLAB function file that generates a feasible initial guess for the cost function c(t).
		
****ObjFun_cMSM.m 
		This is a MATLAB function file that computes the objective function for the MSM estimator.  It also performs some post-estimation tasks, such as outputting Figure 6, the SSR, and some output objects.
		
****ASPS.m 
		This is a MATLAB function file; the name stands for "Adaptive Start Point Search" and is used to generate multiple feasible start points for Bootstrap purposes.  Called by AAA_FILE04_BootstrapDriver_MultiMethod7.m.  
		
****feasiblecheck.m 
		This is a MATLAB function file (called by ASPS.m) to check whether a candidate start point satisfies constraints or not.
		
****startvalgeneratorMultiStep.m 
		This is a MATLAB function file (called by ASPS.m) that computes feasible variations on a candidate start point for the cost function vector, "pi".  It does so by increasing/decreasing levels and curvature by various different factors.  It also experiments with increased curvature in the upper tail and decreased curvature in the initial segment of the domain, and vice versa.
		
****startvalgenerator.m 
		This is a MATLAB function file (called by startvalgeneratorMultiStep.m) that computes feasible variations on a candidate start point for the cost function vector, "pi".    It does so by increasing/decreasing levels and curvature by various different factors.  It also experiments with increased curvature in the upper tail and decreased curvature in the initial segment of the domain, and vice versa.
		
****StressTestTobitSolution.m 
		This is a MATLAB script file that executes the multiple re-start routine (outlined in Appendix A.5) for the Stage 3 Tobit Estimator.  Called by AAA_File00_Driver.m.
		
****ParametricBootstrap_ThetaELse.m 
		This is a MATLAB script file that runs a parametric bootstrap routine to compute standard errors of productivity/motivation projections for inactive students (using the Tobit results from Stage 3).  Called by AAA_File00_Driver.m. 
		
****regressHet.m 
		This is a MATLAB function file that performs a test for heteroskedastic errors and executes FGLS regression and Het-robust Wald tests.  Called by AAA_FILE05_InitialHCProductionRF.m, AAA_FILE05_InitialHCProductionRF4point085.m, AAA_FILE06_HCProductionShortRun4point05.m, and AAA_FILE06_HCProductionShortRun4point085.m.  
		
		
		
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%                    DIRECTORY OF DATA FILES:                       %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

****g2.mat: 
			This MATLAB data file contains anonymized student-level data on all participants of the CHLPR study, including active, marginal, and inactive students.  IMPORTANT NOTE:  IN ORDER TO PROTECT CONFIDENTIALITY OF STUDY PARTICIPANTS, TWO GEOGRAPHIC IDENTIFIERS MEASURED AT THE CENSUS BLOCK GROUP LEVEL HAVE BEEN ALTERED FROM THEIR ORIGINAL VALUES, SO THAT ONE CANNOT TELL WHICH CBG EACH PARTICIPANT LIVES IN.  WE DO SO BY ADDING A UNIFORM(-10000,10000) RANDOM NUMBER TO MEAN CBG INCOME, AND ADDING A UNIFORM(-0.025,0.025) RANDOM NUMBER TO THE FRACTION OF CBG MINORS WITH NO PRIVATE HEALTH INSURANCE.  IN BOTH CASES THE UNIFORM PARAMETER USED REPRESENTS UP TO A 10% CHANGE RELATIVE TO THE MEAN OF THE RAW VARIABLE.  THIS ENSURES THAT ONE CANNOT REVERSE ENGINEER EACH PARTICIPANT'S NEIGHBORHOOD, WHILE LARGELY PRESERVING THE UNDERLYING MOMENTS OF THE AGGREGATE DATA THAT WERE USED FOR ANALYSIS.  THE PERTURBED VERSIONS OF MEAN CBG INCOME AND FRACTION OF UNINSURED MINORS HAVE CORRELATIONS ABOVE 0.99 WITH THE ORIGINAL VERSIONS, AND THEIR CORRELATION COEFFICIENT IS 98.7% OF THE CORRELATION IN THE RAW VERSIONS OF THE VARIABLES.  NOTE, HOWEVER, THAT DUE TO THIS PERTURBATION IN THE RELEASED CBG VARIABLES, THE USER MAY SEE (VERY CLOSE, BUT) SLIGHTLY DIFFERENT RESULTS IN THE OUTPUTS FOR TABLES 2, 3, AND 6, RELATIVE TO NUMBERS REPORTED IN THE PAPER, WHICH WERE COMPUTED WITH THE ORIGINAL VALUES OF THE CBG VARIABLES.  
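
			The perturbation described above amounts to adding an independent uniform draw to each of the two CBG variables.  The sketch below illustrates the arithmetic on synthetic data only.  The variable names, the sample size, and the synthetic distributions are assumptions made for illustration; this is not the study data and not the code used to produce g2.mat.  How close the raw and perturbed versions are in correlation depends on the spread of the raw variable, so the synthetic values here should not be expected to reproduce the correlations quoted above.

```matlab
rng(1);                                        % fix the seed so the sketch is reproducible
n = 500;                                       % hypothetical number of participants
cbgIncome     = 40000 + 120000*rand(n,1);      % synthetic mean CBG income (made up)
fracUninsured = 0.10 + 0.30*rand(n,1);         % synthetic fraction of uninsured minors (made up)

% A Uniform(a,b) draw can be generated as a + (b-a)*rand.
incomeNoise    = -10000 + 20000*rand(n,1);     % Uniform(-10000, 10000)
uninsuredNoise = -0.025 + 0.05*rand(n,1);      % Uniform(-0.025, 0.025)

cbgIncomePert     = cbgIncome + incomeNoise;
fracUninsuredPert = fracUninsured + uninsuredNoise;

% Correlation between raw and perturbed versions of each variable.
Rinc  = corrcoef(cbgIncome, cbgIncomePert);
Rfrac = corrcoef(fracUninsured, fracUninsuredPert);
fprintf('corr(raw, perturbed), income:    %.4f\n', Rinc(1,2));
fprintf('corr(raw, perturbed), uninsured: %.4f\n', Rfrac(1,2));
```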
			
****ProductionData.mat:
			This MATLAB data file contains data at the student-task level for estimation of Stage 1 (but only for active and marginal students, by construction).  It contains two objects.  The first is a dataset, "ProductionData", that has a column for student id ("sid"), a column for cumulative work time across completed learning tasks ("t"), a chronological index of each completion ("k", equivalent to "a" in the final notation of the paper), and a final column indicating which contract the child was assigned to ("contract").  The second object is a dataset, "unfinished", containing work on terminal unfinished tasks.  The first two columns are the same as in ProductionData, but the third column is a binary indicator for whether the student never got paid (i.e., finished <2 total tasks), followed by the total number of completions by that student.
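
			To make the layout of "ProductionData" concrete, the sketch below builds a small toy table with the same four columns and counts completions per student.  All values are made up, and the use of a MATLAB table is an assumption; the released object may be stored as a different type.

```matlab
% Toy stand-in for the "ProductionData" object (all values are made up).
sid      = [101; 101; 101; 102; 102];          % student id
t        = [12.5; 30.0; 41.2; 8.0; 19.5];      % cumulative work time at each completion
k        = [1; 2; 3; 1; 2];                    % chronological index of each completion
contract = [1; 1; 1; 2; 2];                    % contract the child was assigned to
toy = table(sid, t, k, contract);

% Count completed tasks per student: one row per completion.
[ids, ~, grp] = unique(toy.sid);
nCompleted = accumarray(grp, 1);
disp(table(ids, nCompleted));
```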
			
****PointEstimates_MSM7.mat:
			This MATLAB data file contains point estimates of Stage 2 if the user wishes to run later stages of estimation.
			
****ProcessBootstrapsMSM7.mat (uploaded as ProcessBootstrapsMSM7.7z.001 through 009):
			This MATLAB data file contains the data and point estimates (everything that was done through line 405 of AAA_File00_Driver.m) and the 400 processed bootstraps.  The uncompressed version of the file is 35GB, mostly due to the fact that the variable BSResults, which stores bootstrap estimates, is a structure, where each cell contains two structures.  Apparently, this leads to inefficient memory management in MATLAB.  NOTE: due to the large size of this file, it was split into 9 files for upload using 7-zip.  To recombine them, make sure you have 7-zip (free) installed, and then simply double-click the .001 file (the first split file) in 7-Zip and extract; it will automatically merge all parts.  
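
			As an alternative to double-clicking, the extraction can also be attempted from within MATLAB, and the contents of the resulting file can be listed without loading all 35GB into memory.  The sketch below assumes that the split archive parts sit in the current folder and that the 7-Zip command-line tool ("7z") is installed and on the system path.  Neither assumption is guaranteed by the package, and the script itself is not part of it.

```matlab
matName = 'ProcessBootstrapsMSM7.mat';
partOne = 'ProcessBootstrapsMSM7.7z.001';      % first of the nine split-archive parts

if ~isfile(matName) && isfile(partOne)
    % "7z x" extracts a split archive when pointed at its .001 part.
    % Assumes the 7-Zip command-line tool is available as "7z".
    status = system(['7z x "' partOne '"']);
    if status ~= 0
        warning('7z extraction did not succeed; extract manually instead.');
    end
end

if isfile(matName)
    info = whos('-file', matName);             % lists variables without loading them
    for i = 1:numel(info)
        fprintf('%-30s %10.2f GB\n', info(i).name, info(i).bytes/2^30);
    end
else
    fprintf('%s not found; extract the 7-zip archive first.\n', matName);
end
```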

			

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%                DIRECTORY OF TABLES AND FIGURES:                   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
NOTE: to locate the code for Figure XX, navigate to the indicated file and search for "Figure XX"

****Figure 1:
		AAA_File02_DescriptiveAnalyses.m
		
****Figure 2:
		ProcessBootstraps.m 
		
****Figure 3: 
		LEFT PANEL: ProcessBootstraps.m; RIGHT PANEL:  ThetaEEstimator.m
		
****Figure 4:
		AAA_File00_Driver.m 
		
****Figure 5:
		AAA_File00_Driver.m 
		
****Figure 6:
		ObjFun_cMSM.m 
		
****Figure 7:
		AAA_File00_Driver.m 
		
****Figure 8:
		AAA_FILE06_HCProductionShortRun4point05.m 
		
		
****Figure OS.2:
		ThetaEEstimator.m
		
****Figure OS.3:
		ThetaLEstimatorMSM.m		
		
		
		
****Table 1:
		AAA_File02_DescriptiveAnalyses.m
		
****Table 2:
		AAA_File00_Driver.m 
		
****Table 3:
		AAA_File00_Driver.m 
		
****Table 4:
		AAA_File00_Driver.m
		
****Table 5:
		AAA_File00_Driver.m 
		
****Table 6:
		AAA_FILE06_HCProductionShortRun4point05.m  
		
		
		
****Table OS.1:
		AAA_FILE05_InitialHCProductionRF.m 
		
****Table OS.2:
		AAA_File02_DescriptiveAnalyses.m
		
****Table OS.3:
		AAA_FILE05_InitialHCProductionRF.m 
		

		
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%                        DATA DICTIONARY:                           %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
FOR VARIABLE DEFINITIONS, SEE NOTES INCLUDED IN AAA_File00_Driver.m IN THE FOLLOWING LOCATIONS: 
LINES 40-54, 
LINES 128-197, 
LINES 635-663, AND 
LINES 808-851.  
SEE ALSO NOTES IN DIRECTORY OF DATA FILES ABOVE.  